Kernel Density Estimation

Why KDE?

Source

A major problem with histograms, however, is that the choice of binning can have a disproportionate effect on the resulting visualization.

Intuitively, one can also think of a histogram as a stack of blocks, one block per point. By stacking the blocks in the appropriate grid space, we recover the histogram.

But KDE is better because, instead of stacking the blocks on a regular grid, it centers each block on the point it represents, and sums the total height at each location.

Math

Mathematically, a kernel is a positive function K(x;h) which is controlled by the bandwidth parameter h. Given this kernel form, the density estimate at a point y within a group of points xi;i=1...N is given by:

pκ(y)=NK(yxi;h)

Free parameters of kernel density estimation:

Types of Kernels